Data Visualization Project 02

library(GGally)
library(ggthemes)
library(plotly)
library(sf)
library(tidyverse)
weather_raw <- read_csv("../data/atl-weather.csv")
## Rows: 365 Columns: 40
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr   (3): summary, icon, precipType
## dbl  (29): moonPhase, precipIntensity, precipIntensityMax, precipProbability...
## dttm  (8): time, sunriseTime, sunsetTime, precipIntensityMaxTime, temperatur...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
weather_temps <- weather_raw |>
  pivot_longer(cols = c(temperatureLow,
                        temperatureHigh),
               names_to = "temperatureType",
               values_to = "temperature")

The raw daily values are drawn as thin lines in the background. The thicker lines are a loess-smoothed version of the same data, which makes the seasonal trend in high and low temperatures much easier to see.

temps_plot <- ggplot(weather_temps,
                     aes(x = time,
                         y = temperature,
                         color = temperatureType)) +
  geom_line(linewidth = 0.1) +
  geom_smooth(se = FALSE) +
  guides(color = "none") +
  labs(title = "Daily high and low temperatures in Atlanta",
       subtitle = "January 1, 2019 - January 1, 2020",
       x = NULL,
       y = "Fahrenheit") +
  theme_hc()

ggplotly(temps_plot)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
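
As the message above notes, `geom_smooth()` fits a loess curve here. As a rough sketch of what that smoothing does, a similar fit can be computed directly with `stats::loess()`. The `demo` data frame below is simulated for illustration and is not the Atlanta data; `span = 0.75` and `degree = 2` are the `loess()` defaults.

```r
# Simulated daily temperatures (hypothetical values, not the Atlanta data)
set.seed(1)
demo <- data.frame(day = 1:365)
demo$temperature <- 60 + 20 * sin(2 * pi * (demo$day - 100) / 365) +
  rnorm(365, sd = 5)

# Fit a loess curve using the stats::loess() defaults
fit <- loess(temperature ~ day, data = demo, span = 0.75, degree = 2)
demo$smoothed <- predict(fit)

# The smoothed series changes far less from day to day than the raw series
sd(diff(demo$temperature))
sd(diff(demo$smoothed))
```

A smaller `span` would follow the raw values more closely, while a larger one would flatten the curve further.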

Humidity, cloud cover, and visibility are all closely related. Humidity also appears to be strongly associated with visibility on its own, beyond what its relationship with cloud cover would explain.

weather_stats <- weather_raw |>
  select(cloudCover, dewPoint, humidity, visibility)
ggpairs(weather_stats, progress = FALSE)
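
One way to check whether two variables stay related once a third is accounted for is a partial correlation. The sketch below is a minimal version based on regression residuals; the helper `partial_cor()` and the simulated `sim` data frame are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Correlation between x and y after removing the linear effect of a control
# variable from both (a simple partial correlation via regression residuals)
partial_cor <- function(df, x, y, control) {
  df <- df[complete.cases(df[, c(x, y, control)]), ]
  res_x <- resid(lm(reformulate(control, response = x), data = df))
  res_y <- resid(lm(reformulate(control, response = y), data = df))
  cor(res_x, res_y)
}

# Simulated example (hypothetical data): b and c are related only through a
set.seed(1)
sim <- data.frame(a = rnorm(500))
sim$b <- sim$a + rnorm(500)
sim$c <- sim$a + rnorm(500)

cor(sim$b, sim$c)                # clearly positive
partial_cor(sim, "b", "c", "a")  # close to zero once a is controlled for
```

With the data from this report, the corresponding call would be `partial_cor(weather_stats, "humidity", "visibility", "cloudCover")`. A value far from zero would be consistent with the relationship described above.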

lakes <- read_sf("../data/Florida_Lakes/Florida_Lakes.shp")
counties <- read_sf("../data/Florida_Counties/Florida_Counties.shp")
lakes$type <- "water"
counties <- counties |>
  mutate(type = ifelse(COUNTYNAME == "POLK", "highlight", "land"))

polk <- counties |>
  filter(COUNTYNAME == "POLK")
# Midpoint of Polk County's bounding box, used to position the map label
box <- st_bbox(polk)
center <- c(x = (box[["xmin"]] + box[["xmax"]]) / 2,
            y = (box[["ymin"]] + box[["ymax"]]) / 2)

The original goal was to make this map interactive with plotly, but the process ran out of memory when trying to save the plot, so a static map is shown instead. Polk County is highlighted and labeled, but the map presents no data beyond the filled lakes and the county outlines.

ggplot() +
  geom_sf(data = counties,
          mapping = aes(fill = type),
          linewidth = 0.1) +
  geom_sf(data = lakes,
          mapping = aes(fill = type),
          linewidth = 0.1) +
  # A single fixed label is an annotation, not a mapping from data
  annotate("text",
           x = center[1],
           y = center[2],
           label = "Polk County",
           color = "black",
           size = 4) +
  scale_fill_manual(values = c("land" = "green",
                               "water" = "blue",
                               "highlight" = "red")) +
  guides(fill = "none") +
  labs(title = "Map of Florida",
       x = NULL,
       y = NULL) +
  theme_hc()
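
The memory problem mentioned above might be eased by simplifying the polygons before handing the map to plotly, since detailed shapefiles carry many vertices. The sketch below shows the idea on the North Carolina example data that ships with sf, not on the Florida shapefiles; the 1 km tolerance and the choice of EPSG:32119 are assumptions for this example, and whether this is enough to make `ggplotly()` work on the Florida map has not been tested here.

```r
library(sf)

# Example polygons shipped with sf (North Carolina counties), projected to a
# CRS with metre units so that dTolerance is expressed in metres
nc <- read_sf(system.file("shape/nc.shp", package = "sf")) |>
  st_transform(32119)

# Simplify the outlines with a 1 km tolerance (an assumed value)
nc_simple <- st_simplify(nc, preserveTopology = TRUE, dTolerance = 1000)

# Compare in-memory sizes before and after simplification
object.size(nc)
object.size(nc_simple)
```

For the Florida data, the same call would be applied to `counties` and `lakes`, with a tolerance chosen for their coordinate reference system.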